ABSTRACT



A method of embedding auxiliary information into a set of 
host data, such as a photograph, television signal, facsimile 
transmission, or identification card. All such host data con- 
tain intrinsic noise, allowing pixels in the host data which 
are nearly identical and which have values differing by less 
than the noise value to be manipulated and replaced with 
auxiliary data. As the embedding method does not change 
the elemental values of the host data, the auxiliary data do 
not noticeably affect the appearance or interpretation of the 
host data. By a substantially reverse process, the embedded 
auxiliary data can be retrieved easily by an authorized user. 

10 Claims, 19 Drawing Sheets 
Computer code for determining pair values.

for(i = 0; i < (int)bh.colors; i++) {
    int avg;

    if(i % 10 == 0)fprintf(stderr,".");
    c1.red = colormap[i].r;
    c1.grn = colormap[i].g;
    c1.blu = colormap[i].b;
    if(greyscale) {
        avg = (int)(c1.red + c1.grn + c1.blu)/3;
        if(avg==0)continue;
        if(avg!=c1.red || avg!=c1.grn ||
           avg!=c1.blu)continue;
    }
    (void)rgbhsi(&c1, &d1); /* convert to HSI components */
    if((int)d1.inten==255)unused++;
    old_diff = 0.f;
    if((int)d1.inten==0 || (int)d1.inten==(int)bh.colors)continue;
    for(j=i+1; j < (int)bh.colors; j++) {
        c2.red = colormap[j].r;
        c2.grn = colormap[j].g;
        c2.blu = colormap[j].b;
        (void)rgbhsi(&c2, &d2); /* convert to HSI components */
        color_diff = d2.hue - d1.hue;



Fig.2A 
        /* hue & inten. difference must be ok */
        if(!greyscale) {
            if((abs((int)color_diff) < COLOR_NOISE)\
               && (color_diff < old_diff)\
               && ((int)fabs((double)(d2.inten-d1.inten)) <
                   INTEN_NOISE) ) {
                if(k>(int)bh.cols/2 -1)break;
                pair[k].i = i;
                pair[k].j = j;
                pair[k].count = 0;
                k++;
                old_diff = color_diff;
            }
        }
        else {
            avg = (int)(c2.red + c2.grn + c2.blu)/3;
            if(avg==0)continue;
            if(avg!=c2.red || avg!=c2.grn || avg!=c2.blu)continue;
            if( (int)fabs((double)(d2.inten-d1.inten)) <
                INTEN_NOISE &&
                (int)fabs((double)(d2.inten-d1.inten)) !=0) {
                if(k>(int)bh.cols/2 -1)break;
                pair[k].i = i;
                pair[k].j = j;



Fig.2B 
                pair[k].count = 0;
                k++;
            }
        }
    } /* j loop */
    if(k>(int)bh.cols/2-1)break;
} /* i loop */
no_pairs = k;



Fig. 2C 
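
By way of illustration only, the greyscale branch of the listing in FIG. 2 can be reduced to a minimal, self-contained sketch. The function name find_grey_pairs, the Pair structure layout, and the representation of the palette as an array of grey levels are assumptions made for this sketch and are not taken from the listing. The sketch accepts two palette entries as a pair when their intensities differ by a nonzero amount that is less than the noise limit.

```c
#include <stdlib.h>

/* Assumed layout, modeled on the pair[k].i / .j / .count members. */
typedef struct { int i, j; long count; } Pair;

/* Hypothetical helper: collect index pairs (i, j), i < j, whose grey
   levels differ by a nonzero amount smaller than inten_noise.
   Entries with zero intensity are skipped, as in the listing.
   Returns the number of pairs stored (at most max_pairs). */
int find_grey_pairs(const unsigned char *grey, int ncolors,
                    int inten_noise, Pair *pair, int max_pairs)
{
    int k = 0;
    for (int i = 0; i < ncolors && k < max_pairs; i++) {
        if (grey[i] == 0) continue;
        for (int j = i + 1; j < ncolors && k < max_pairs; j++) {
            int diff = abs((int)grey[j] - (int)grey[i]);
            if (grey[j] == 0) continue;
            if (diff != 0 && diff < inten_noise) {
                pair[k].i = i;
                pair[k].j = j;
                pair[k].count = 0;   /* filled in later from the histogram */
                k++;
            }
        }
    }
    return k;
}
```

With the grey levels {0, 100, 102, 200, 100} and a noise limit of 5, this sketch reports the pairs (1,2) and (2,4). Such overlapping candidates are the reason a later duplicate-elimination step is needed.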
Computer code to eliminate duplicate pairs.

for(i=0;i<k;i++) {
    pair[i].count += (hist_values[pair[i].i] +
                      hist_values[pair[i].j]);
    if(pair[i].i==0 || pair[i].j==0)pair[i].count = 0;
}
p_sort(pair, k);
no_pairs = duplicate(k, pair);
total = 0;
for(i=0;i<no_pairs;i++)
    total += pair[i].count;
total /= 8;
value = (float)total - (float)NCOLS;
if(value > 0.f) fprintf(stderr,"\n%.1f Kb embedding space located", value/1000.f);
if(value == 0.f)fprintf(stderr,"\nNo embedding space available in this image");
if(value < 0.f) fprintf(stderr,"\nInsufficient embedding space");



Fig. 3 
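
The routines p_sort() and duplicate() called in FIG. 3 are not reproduced in the listing. The following is a minimal sketch of one way such a step could work, under stated assumptions: the names unique_pairs and capacity_bytes, the Pair layout, and the greedy selection rule are hypothetical. Each pair count is set to the sum of the two histogram values, the pairs are sorted in decreasing order of count, and a pair is kept only if neither of its values already belongs to a kept pair.

```c
#include <stdlib.h>

/* Assumed layout, modeled on the pair[k].i / .j / .count members. */
typedef struct { int i, j; long count; } Pair;

static int by_count_desc(const void *a, const void *b)
{
    long ca = ((const Pair *)a)->count, cb = ((const Pair *)b)->count;
    return (ca < cb) - (ca > cb);            /* larger counts first */
}

/* Hypothetical stand-in for p_sort() followed by duplicate().
   Returns the number of pairs kept, or -1 if allocation fails. */
int unique_pairs(Pair *pair, int k, const long *hist, int nvalues)
{
    unsigned char *used = calloc((size_t)nvalues, 1);
    int kept = 0;
    if (used == NULL) return -1;
    for (int n = 0; n < k; n++)              /* occurrences of both values */
        pair[n].count = hist[pair[n].i] + hist[pair[n].j];
    qsort(pair, (size_t)k, sizeof *pair, by_count_desc);
    for (int n = 0; n < k; n++) {
        if (used[pair[n].i] || used[pair[n].j]) continue;
        used[pair[n].i] = used[pair[n].j] = 1;
        pair[kept++] = pair[n];
    }
    free(used);
    return kept;
}

/* One bit per occurrence of a pair value, eight bits per byte. */
long capacity_bytes(const Pair *pair, int no_pairs)
{
    long total = 0;
    for (int n = 0; n < no_pairs; n++) total += pair[n].count;
    return total / 8;
}
```

For a histogram {0, 40, 60, 100, 8, 8} and candidate pairs (1,2), (2,3), (4,5), this sketch keeps (2,3) and (4,5) and reports a capacity of 22 bytes. The listing additionally subtracts NCOLS from the total, which this sketch omits.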
Computer code constraining pair values for Truecolor images.

/* Find histogram point-pairs within RANGE counts, within
   10% in number */
for(ip=0;ip<3;ip++) {
    int nstart;
    long li;

    fprintf(stderr,"\n Analyzing intensity histogram for plane %d", ip);
    for(i=0;i<256;i++) {
        fvalue[i]=(float)hist_values[ip*256+i];
    }
    nstart = RANGE;
    k = 0;
    while(nstart<256 && k<(int)bh.cols/2) {
        for(i=nstart;i<256;i++) {
            for(j=i-1;j>i-RANGE;j--) {
                li = hist_values[ip*256+i];
if((int)(fvalueOrfvalue[i])==0)break; 
                if((float)fabs((double)(fvalue[j]-fvalue[i]))\
                   < 0.05f*(fvalue[j]+fvalue[i])) {
                    pair[k].i = i;
                    pair[k].j = j;
                    pair[k].count = li + hist_values[ip*256+j];
                    k++;



Fig. 4 A 
                    if(k>(int)bh.cols/2-1)break;
                }
            } /* end of inner pixel comparison loop (j) */
            if(k>(int)bh.cols/2-1)break;
        } /* end of outer pixel comparison loop (i) */
        p_sort(pair, k);
        no_pairs = duplicate(k, pair);
        k = no_pairs;
        if(verbose)fprintf(stderr,"%3d pairs\r", k);
        nstart = i;
    } /* end of while loop */



Fig. 4 B 
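
The frequency-of-occurrence constraint in FIG. 4 can be illustrated, for a single color plane, by the following minimal sketch. The function name find_hist_pairs and the Pair layout are assumptions for this sketch, and the sorting and duplicate-elimination passes of the listing are omitted. Two intensity values are paired when they lie within a window of `range` values of each other and their histogram counts differ by less than 5% of the sum of the counts.

```c
/* Assumed layout, modeled on the pair[k].i / .j / .count members. */
typedef struct { int i, j; long count; } Pair;

/* Hypothetical helper: pair intensity i with a lower intensity j,
   i - range < j < i, when the two histogram counts are nearly equal.
   Returns the number of pairs stored (at most max_pairs). */
int find_hist_pairs(const long hist[256], int range,
                    Pair *pair, int max_pairs)
{
    int k = 0;
    for (int i = range; i < 256 && k < max_pairs; i++) {
        for (int j = i - 1; j > i - range && k < max_pairs; j--) {
            double fi = (double)hist[i], fj = (double)hist[j];
            double diff = fj - fi;
            if (diff < 0.0) diff = -diff;
            if (fi + fj == 0.0) continue;    /* both values unused */
            if (diff < 0.05 * (fj + fi)) {
                pair[k].i = i;
                pair[k].j = j;
                pair[k].count = hist[i] + hist[j];
                k++;
            }
        }
    }
    return k;
}
```

Because the two values of a pair occur almost equally often, exchanging one for the other leaves the shape of the histogram nearly unchanged, which is the stated purpose of the constraint.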
Computer code for embedding auxiliary data.

/* EMBEDDING CODE: */
/*
   lj, index over pixels in the image-data
   inrow, index within the image-data row buffer
   nrow, row number in the image-data
   li, index over pixels in the data-image
   d_inrow, index within the data-image row buffer
   k, index within the PAIRS structure array
   maxval, no. of bits embedded
   bitindex, bit position within the data-image byte
   byteplace, position for read/write in tape6 file
*/
data_row = (unsigned char *)malloc((size_t)NCOLS);
if(data_row==NULL) {
    pm_error("Data row data allocation failed!");
    return(1);
}
maxval = bit_place_index.maxval;
d_inrow = bit_place_index.d_inrow;
bit_place_index.li += d_inrow;
lj = (long)krow;
k = 0;
nrow = -1;
for(li=bit_place_index.li; li<length-NCOLS; li++) {
    bit_place_index.li = li;
    if(li == -512L) { /* header information */
        byteptr=(unsigned char *)&data_header;
        for(d_inrow=0;d_inrow<sizeof(data_header);d_inrow++)
            data_row[d_inrow]=*(byteptr+d_inrow);
        d_inrow = 0;
    }



Fig. 5 A 
    if((li >= 0L && (li % (long)NCOLS) == 0L) || reread != 0)
    { /* next row of data-image */
        j = fgetpos(tape5,&tape5_pos);
        j = fread(data_row, 1, (size_t)NCOLS, tape5);
        if(!reread) {
            for(i=0;i<j;i++) checksum += data_row[i];
            d_inrow = 0;
            bit_place_index.d_inrow = 0;
        }
        reread = 0; /* turn off flag for re-read on next Truecolor
                       plane */
    }
    for(bitindex=bit_place_index.bitplace;
        bitindex<NO_BITS; bitindex++) {
        bit_place_index.bitplace = bitindex;
        if((lj-krow) % (long)(BYTES_IN_ROW) == 0L) {
            if(nrow >= 0) { /* write only after you read */
                inrow = fseek(tape6, byteplace, SEEK_SET);
                inrow = fwrite(image_row, 1, (size_t)
                               (BYTES_IN_ROW), tape6);
                byteplace += inrow;
                byteplace += pad; /* skip pad bytes */
                inrow = krow;
            }
            if(lj/(long)OFFSET == 0L ||
               (lj+(BYTES_IN_ROW+pad))/
               (BYTES_IN_ROW+pad) > (unsigned long)bh.rows) {
                if(bailout()) { /* end of
                                   image-data-user
                                   termination */
                    i = 1;



Fig. 5 B 
                    goto QUIT;
                }
                if(k==no_pairs)goto PLANE; /* next plane of
                                              image */
                lj = krow; /* pick next pair and start over */
                pvalue.i=(unsigned int)pair[k].i; /* zero */
                pvalue.j=(unsigned int)pair[k].j; /* one */
                if(verbose && k>0) fprintf(stderr," %ld",
                                           pvalue.count);
                pvalue.count = 0L;
                if(verbose) fprintf(stderr,"\rEmbedding Pair %2d (%3d,%3d)",
                                    k, pvalue.i, pvalue.j);
                else fprintf(stderr,".");
                k++;
                byteplace = bh.pixeloffset;
            }
            inrow = (int)((lj-krow)/((long)BYTES_IN_ROW+pad)); /* read
                                                                  next row */
            inrow = fseek(tape6, byteplace, SEEK_SET);
            inrow = fread(image_row, 1, (size_t)
                          BYTES_IN_ROW, tape6);
            inrow = krow;
        } /* end new row (lj) test */
        /* Embed one byte */
        if(ip>=0&&pair[k-1].count==0) { /* finished a pair */
            lj += OFFSET;
            inrow += OFFSET;
            bitindex--;
            continue;
        }



Fig. 5 C 
        if((int)image_row[inrow]==pvalue.i) { /* find a zero
                                                 value */
            if(test((int)data_row[d_inrow],bitindex))
                image_row[inrow]=(unsigned char)pvalue.j;
            maxval++;
            if(pair[k-1].count==0) {
                pm_error("\nPair count error!");
                i = 1;
                goto QUIT;
            }
            pair[k-1].count--;
            pvalue.count++;
            if(bitindex==NO_BITS-1)bit_place_index.bitplace
                =0;
        }
        else bitindex--; /* haven't got this bit yet! */
        lj+=OFFSET;
        inrow+= OFFSET;
    } /* end of bitindex loop */
    d_inrow++;
} /* end of li (data index) loop */



Fig. 5 D 
Computer code to analyze lengths of runs.

int rowstats(unsigned char *data_row, long *histogram,
             int ncols, int packet_size) {
    int i, j, k, l; /* loop counters */
    int runs=0; /* return value */
    int count; /* no. of pixels in the run */
    char letter = 'A'; /* starting code for flagging runs
                          in the row */
    unsigned char block[MAXRUN+3]; /* a block
                                      containing the run being
                                      examined */
    /* find first bit in the row & adjust as a packet flag */
    if(packet_size >=0) {
        j = packet_col(data_row, packet_size, ncols);
    }
    if(ncols <=0) return(-1);
    for(i=MINRUN;i<=MAXRUN;i+=2) { /* i is the runlength being
                                      searched */
        k = 0;
        for(j=1;j<ncols;j++) { /* NOTE: data_row[0] is assumed
                                  to be zero!! */
            if(data_row[j]==(unsigned char)ONE) {
                if(data_row[j-1]!=(unsigned char)ZERO) continue;



Fig. 6 A 
                k = j; /* a block start */
            }
            else continue;
            /* find a block of data ending with a zero pixel */
            if(k+i+2 > ncols) break;
            for(l=k;l<k+i+3;l++) block[l-k] = data_row[l];
            l=j;
            if(block[i+1] != (unsigned char)ZERO) goto NEXT;
            if(block[i+2] > (unsigned char)ONE ) goto NEXT;
            /* examine block for pixel count */
            count = 0;
            for(l=0;l<i;l++) { /* all but last bit in block must =1 */
                if(block[l]==(unsigned char)ONE) count++;
            }
            count++;
            l = j+1;
            if(count == i+1) { /* set all but last pixel in run to flag
                                  value */
                if(histogram != NULL)histogram[i]++;
                runs++;
                for(l=j;l<j+count-1;l++) data_row[l] = letter;
                l++;
            }
            NEXT: for(j=l+1;j<NCOLS;j++) if(data_row[j]==(unsigned
                                            char)ZERO)break;



Fig. 6 B 
        } /* end of row (j) loop */
        letter++;
    } /* end of run (i) loop */

    return(runs);
}



Fig. 6 C 
Computer code to set packet-start pixel flag.

int packet_col(unsigned char *data_row, int packet_size, int
               ncols) {
    int i;
    /* find first bit in the row & adjust as a packet flag */
    for(i=1;i<ncols;i++) {
        if(data_row[i]==(unsigned char)ZERO) {
            if(packet_size<0) break;
            if(packet_size>0) { /* first bit set to an even
                                   column */
                if(i%2 == 0)break;
                data_row[i] = (unsigned char)ONE;
            }
            else { /* first bit set to an odd column */
                if(i%2 != 0)break;
                data_row[i] = (unsigned char)ONE;
            }
        }
    }
    if(i==ncols)return(-1); /* no black pixels in the row */
    if(packet_size>=0) return(i); /* index of the first black pixel */
    if(i%2) return(1); /* if (packet_size==-1) return odd */
    else return(0); /* return even */
}



Fig. 7 
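
To make the behavior of the routine in FIG. 7 concrete, the following self-contained sketch restates it with the two pixel constants defined locally. The values ZERO=0 and ONE=1 are assumptions, since the listing does not show their definitions. The two parity branches of the listing are merged here without changing the logic.

```c
#define ZERO 0   /* assumed value; not shown in the listing */
#define ONE  1   /* assumed value; not shown in the listing */

/* Restatement of the FIG. 7 routine.
   packet_size > 0 : clear leading ZERO pixels until one sits in an even column
   packet_size == 0: clear leading ZERO pixels until one sits in an odd column
   packet_size < 0 : modify nothing; report the parity of the first ZERO pixel */
int packet_col(unsigned char *data_row, int packet_size, int ncols)
{
    int i;
    for (i = 1; i < ncols; i++) {
        if (data_row[i] == (unsigned char)ZERO) {
            if (packet_size < 0) break;
            if (packet_size > 0) { if (i % 2 == 0) break; }
            else                 { if (i % 2 != 0) break; }
            data_row[i] = (unsigned char)ONE;   /* wrong parity: overwrite */
        }
    }
    if (i == ncols) return -1;          /* no ZERO pixel found */
    if (packet_size >= 0) return i;     /* column of the flag pixel */
    return i % 2;                       /* 1 = odd column, 0 = even column */
}
```

For the row {0, 1, 1, 0, 0, 1} and a positive packet_size, the ZERO pixel in column 3 is overwritten and the routine returns 4. A later call with packet_size equal to -1 returns 0, indicating an even column and therefore a packet start.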
Computer code to embed data in two-color images.

READLINE: nrow = (int)(lj/((long)bh.cols)); /* read data from next
                                               row */
if(verbose) {
    if(nrow==0)fprintf(stderr,"\n");
    fprintf(stderr,"\rrow %4d", nrow);
}
else motion(stderr);
bit_count = 0;
image_row[0] = 0; /* row buffer always starts
                     with a zero */
if(verbose==2 && nrow <=61)fprintf(tape9,"\nnrow byteplace %d %ld",
                                   nrow, byteplace);
inrow = fseek(tape6, byteplace, SEEK_SET);
writeplace = byteplace;
for(j = 1; j < (int)bh.cols+1; j++) {
    int pix;
    if(bit_count <= 0) { /* need another byte */
        bit_count = 8;
        bit_store = pbm_getrawbyte(tape6);
        byteplace++;
    }



Fig. 8 A 
    bit_count -= bh.bitsperpixel;
    pix = ( bit_store >> bit_count ) & mask;
    image_row[j] = (unsigned char)pix;
#ifdef INSERT_KEY
    /* key row set to zero to hold key pairs */
    if(nrow == KEYLINE)image_row[j] = (unsigned
                                       char)ZERO;
#endif
} /* cols */
byteplace += pad;
j = packet_col(image_row,-1,(int)bh.cols);
i = rowstats(image_row,NULL,(int)bh.cols+1,-1); /* flag the
                                                   embedding pixels */
if(verbose==2) fprintf(tape9,"\n nrow,i,j: %d %d %d", nrow,i,j);
if(j<0 || i==0) { /* a row of white pixels or no
                     pixels for embedding */
    if(nrow+1<(int)bh.rows) {
        lj += bh.cols;
        goto READLINE;
    }
}
if(j==1 && kp==0) {



Fig. 8 B 
    fprintf(stderr,"\nPacket start-index error, packet %d",
            packet_no-1);
    goto QUIT;
}
if(j==0 && kp > 0) {
    fprintf(stderr,"\nContinuation packet-index error, packet %d",
            packet_no-1);
    goto QUIT;
}
inrow = 1;
if(kp==0 && verbose==2)
    fprintf(tape9,"\nPacket start-row %d, bits found %d",nrow,i);
kp++;
} /* end new row (lj) test */
/* Embed one byte, use all pairs for each row */
for(k=0;k<no_pairs;k++) {
    if(pair[k].count<0) {
        pm_error("\nPair count error!");
        goto QUIT;
    }
    testltr = (unsigned char)(letter+(unsigned char)pair[k].i/2 -
                              1); /* flag letter */



Fig. 8 C 
    if(image_row[inrow]==testltr) { /* find a flagged run */
        if(verbose==2 && nrow==60) fprintf(tape9,"inrow %d",
                                           inrow);
        inrow += (unsigned int)pair[k].j;
        lj += pair[k].j;
        if(test((int)packet[inpacket_row],bitindex)) image_row
            [inrow-1]=1;
        else image_row[inrow-1]=0;



Fig. 8 D 
DATA EMBEDDING

The present invention generally relates to digital manipu- 
lation of numerical data and, more specifically, to the 
embedding of external data into existing data fields. This 
invention was made with Government support under Con- 
tract No. W-7405-ENG-36 awarded by the U.S. Department 
of Energy. The Government has certain rights in the inven- 
tion. 

FIELD OF THE INVENTION 

The use of data in digital form is revolutionizing
communication throughout the world. Much of this digital
communication is over wire, microwaves, and fiber optic
media. Currently, data can be transmitted flawlessly over
land, water, and between satellites. Satellites in orbit allow
communication virtually between any two points on earth,
or in space.

In many situations, it may be of benefit to send particular 
secondary data along with the primary data. Secondary data 
could involve the closed captioning of television programs, 
identification information associated with photographs, or 
the sending of covert information with facsimile transmis- 
sions. Such a technique is suited also for use as a digital 
signature verifying the origin and authenticity of the primary 
data. 

Data in digital form are transmitted routinely using wide 
band communications channels. Communicating in digital 
fashion is facilitated greatly by error-correcting software and 
hardware protocols that provide absolute data fidelity. These 
communication systems ensure that the digital bit stream 
transmitted by one station is received by the other station 
unchanged. 

However, most digital data sources contain redundant 
information and intrinsic noise. An example is a digital 
image generated by scanning a photograph, an original work 
of electronic art, or a digitized video signal. In the scanning
or digital production process of such images, noise is 
introduced in the digital rendition. Additionally, image 
sources, such as photographic images and identification 
cards, contain noise resulting from the grain structure of the 
film, optical aberrations, and subject motion. Works of art 
contain noise which is introduced by brush strokes, paint 
texture, and artistic license. 

Redundancy is intrinsic to digital image data, because any 
particular numerical value of the digital intensity exists in 
many different parts of the image. For example, a given 
grey-level may exist in the image of trees, sky, people or 
other objects. In any digital image, the same or similar 
numerical picture element, or pixel value, may represent a 
variety of image content. This means that pixels having
similar numerical values and frequency of occurrence in 
different parts of an image can be interchanged freely, 
without noticeably altering the appearance of the image or 
the statistical frequency of occurrence of the pixel values. 

Redundancy also occurs in most types of digital 
information, whenever the same values are present more 
than once in the stream of numerical values representing the 
information. For a two-color, black and white FAX image, 
noise consists of the presence or absence of a black or white 
pixel value. Documents scanned into black and white
BITMAP® format contain runs of successive black (1) and
white (0) values. Noise in these images introduces a
variation in the length of a pixel run. Runs of the same value are
present in many parts of the black and white image, in 
different rows. This allows the present invention also to be 
applied to facsimile transmissions. 
The existence of noise and redundant pixel information in
digital data permits a process for implanting additional
information in the noise component of digital data. Because
of the fidelity of current digital communication systems, the
implanted information is preserved in transmission to the
receiver, where it can be extracted. The embedding of
information in this manner does not increase the bandwidth
required for the transmission because the data implanted
reside in the noise component of the host data. One may
thereby convey meaningful, new information in the
redundant noise component of the original data without it ever
being detected by unauthorized persons.

It is therefore an object of the present invention to provide
apparatus and method for embedding data into a digital
information stream so that the digital information is not
changed significantly.

It is another object of the present invention to provide
apparatus and method for thwarting unauthorized access to
information embedded in normal digital data.

Additional objects, advantages and novel features of the
invention will be set forth in part in the description which
follows, and in part will become apparent to those skilled in
the art upon examination of the following or may be learned
by practice of the invention. The objects and advantages of
the invention may be realized and attained by means of the
instrumentalities and combinations particularly pointed out
in the appended claims.

BACKGROUND OF THE INVENTION 


SUMMARY OF THE INVENTION 

In accordance with the purposes of the present invention
there is provided a method of embedding auxiliary data into
host data comprising the steps of creating a digital
representation of the host data consisting of elements having
numerical values and containing a noise component;
creating a digital representation of the auxiliary data in the form
of a sequence of bits; evaluating the noise component of the
digital representation of the host data; comparing elements
of the host data with the noise component to determine pairs
of the host elements having numerical values which differ by
less than said value of said noise component; and replacing
individual values of the elements with substantially
equivalent values from said pairs of elements in order to embed
individual bit values of the auxiliary data corresponding to
the sequence of bits of the auxiliary data; and outputting the
host data with the auxiliary data embedded into the host data
as a file.

In accordance with the purposes of the present invention
there is further provided a method of extracting embedded
auxiliary data from host data containing a noise component
comprising the steps of extracting from the host data a bit
sequence indicative of the embedded auxiliary data, and
which allows for verification of the host data; interpreting
the host data-element pairs which differ by less than the
value of the noise component and which correspond to bit
values of the auxiliary data; identifying the auxiliary data bit
sequence corresponding to the pair values; and extracting
the auxiliary data as a file.
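
By way of illustration only, the embedding and extraction methods summarized above can be sketched for a single pair of values. The function names embed_bits and extract_bits, the most-significant-bit-first order, and the rule that every host element equal to either pair value carries one bit are assumptions made for this sketch. The partial listings in the figures handle headers, keys, multiple pairs, and row buffering, none of which is modeled here.

```c
#include <stddef.h>

/* Hypothetical helper: each host element equal to v0 or v1 carries one
   bit; it is set to v0 for a 0 bit and to v1 for a 1 bit.
   Returns the number of bits actually embedded. */
size_t embed_bits(unsigned char *host, size_t n,
                  unsigned char v0, unsigned char v1,
                  const unsigned char *aux, size_t nbits)
{
    size_t b = 0;
    for (size_t p = 0; p < n && b < nbits; p++) {
        if (host[p] != v0 && host[p] != v1) continue;   /* not a carrier */
        int bit = (aux[b / 8] >> (7 - b % 8)) & 1;      /* MSB first */
        host[p] = bit ? v1 : v0;
        b++;
    }
    return b;
}

/* Hypothetical inverse: read the carriers back in the same order.
   Returns the number of bits recovered. */
size_t extract_bits(const unsigned char *host, size_t n,
                    unsigned char v0, unsigned char v1,
                    unsigned char *aux, size_t nbits)
{
    size_t b = 0;
    for (size_t p = 0; p < n && b < nbits; p++) {
        if (host[p] != v0 && host[p] != v1) continue;
        if (b % 8 == 0) aux[b / 8] = 0;                 /* start a new byte */
        if (host[p] == v1) aux[b / 8] |= (unsigned char)(1u << (7 - b % 8));
        b++;
    }
    return b;
}
```

Because the two values of a pair differ by less than the noise value, the rewritten host remains within the noise of the original, and only a receiver who knows the pair (v0, v1) can locate the carriers.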
BRIEF DESCRIPTION OF THE DRAWINGS

The accompanying drawings, which are incorporated in
and form a part of the specification, illustrate the embodiments
of the present invention and, together with the
description, serve to explain the principles of the invention.
In the drawings:

FIG. 1 is a block diagram illustrating the processes used
in the embedding and extraction of data from a host.

FIG. 2 is a partial listing of computer code used for
determining host data pairs having similar values and for
converting RGB components to HSI components.

FIG. 3 is a partial listing of computer code used for
eliminating duplicate host data pairs.

FIG. 4 is a partial listing of computer code which, for
Truecolor images, introduces a constraint on the frequency
of occurrence of host data pairs that minimizes the effect of
embedding on the host data histogram.

FIG. 5 is a partial listing of computer code that performs
the actual embedding of auxiliary data into the host data,
including the considerable information which is necessary to
manipulate the data in the header information, auxiliary
bit-stream, and the host data files.

FIG. 6 is a partial listing of computer code that analyzes
the lengths of runs in a row of pixels in two-color facsimile
host data.

FIG. 7 is a partial listing of computer code whose purpose
is to ensure that the first pixel in a PACKET-START data
row starts in an even column number. The location of the
first pixel in the row flags the start of the data packets.

FIG. 8 is a partial listing of computer code for embedding
data into two-color host images, such as facsimile
transmissions.

DETAILED DESCRIPTION

The present invention allows data to be embedded into a
digital transmission or image without naturally discernible
alteration of the content and meaning of the transmission or
image. This is made possible because of the technique of the
present invention, in which similar pixel values in a set of
digital host data are re-ordered according to the desired
embedded or implanted information. The host data image
examples are represented in the MICROSOFT® BITMAP®
(.BMP) format, so that the resulting image contains the
embedded auxiliary information without that information
being readily discernible.

The MICROSOFT® BITMAP® image format is a
public-domain format supporting images in the Truecolor,
color palette, grey-scale, or black and white representations.
Truecolor images have 24-bits per pixel element, with each
byte of the pixel element representing the intensity of the
red, green, and blue (RGB) color component. Color palette
images contain a table of the permitted RGB values. The
pixel value in a color palette image is an index to this table.
Grey-scale images give the numerical intensity of the pixel
values. Black and white representation assigns either zero or
one as one of the two possible pixel values. The invention
will be made understandable in the context of the BITMAP®
image types by reference to the following description.

At the point when most sensory obtained information is
represented in digital form, whether it be from video,
photographs, laboratory measurements, or facsimile
transmissions, the digital data contain intrinsic noise and
redundant information which can be manipulated to carry
extra information. Through use of this invention, the extra
information also can be extracted easily by an authorized
and enabled receiver of the data.

Redundancy in digital image data occurs when a particular
numerical value of the digital intensity exists in many
different parts of the image. Redundancy is found commonly
in images because a given grey-level exists in the rendition
of trees, sky, clouds, people, and other objects. The presence
of noise in digital images permits the picture elements,
pixels, to vary slightly in numerical value. For 8-bit digital
data, the pixel numerical values range from 0-255. As
pixels having the same or similar numerical values represent
a variety of image content, many values in different locations
of an image can be interchanged freely. The image
appearance and the statistical frequency of occurrence of a
particular pixel value are affected little by the interchanging
of the spatial position of pixels close in numerical value.

Initially, from the original digital data (hereinafter often
referred to as the "host" data), the present invention first
converts the host data to digital form, if necessary, and then
creates an image histogram to show the probability density
of numerical pixel values occurring in the image. The
number of times a particular pixel value occurs in the image
is plotted versus the value. For 8-bit digital data, the pixel
values range from 0-255. Of course, the level of noise in an
image will depend on the source of the data, with different
noise levels expected between photos, original artwork,
audio, and facsimile transmissions.

The actual embedding of the auxiliary data into the host
data is a three-part process, the basic steps of which are
illustrated in FIG. 1. First, an estimate of the noise component
of the host data is determined and used in combination
with an analysis of a histogram of the host data numerical
values to identify pairs of values in the host data that occur
with approximately the same statistical frequency, and that
differ in value by less than the value of the noise component.
Second, the position of occurrence of the pair values found
is adjusted to embed the bitstream of the auxiliary information
set. Third, the identified pairs of values in the host data
are used to create a key for the extraction of the embedded
data.

Extracting embedded data inverts this process. The key
placed in the image in the embedding phase specifies the
pair-values which contain the embedded auxiliary information.
With the pair-values known, extraction consists of
recreating the auxiliary data according to the positions of
pixels having the pair-values given in the key. The key data
are used first to extract header information. The header
information specifies the length and the file name of the
auxiliary data, and serves to validate the key. If the image
containing embedded information has been modified, the
header information will not extract data correctly. However,
successful extraction recreates the auxiliary data exactly in
an output file.

The principle of data embedding according to the present
invention involves the rearrangement of certain host data
values in order to encode the values of the extra data which
is to be added. For the purposes of this description of the
invention, consider a host data set represented by 8 bits of
binary information, with values ranging between 0 and 255
bits for each host data sample. Further, assume that the noise
value, N, for a signal, S, is given by N=±S/10, or approximately
10% of the signal value. For many data, the noise
component can be approximated by a constant value or
percentage, such as the 10% value used for this description.
Two values in the host data, $d_i$ and $d_j$, are within the noise
value if:
$$|d_i - d_j| \le N \qquad (10)$$

The frequency of occurrence or histogram value of a certain
value, $d_i$, is $f(d_i)$. Data values meeting the criteria of
Equation 10, and occurring in the host data with frequency of
occurrence $f(d_i) - f(d_j) < \delta$, where $\delta$ is the tolerance
imposed for statistical equality, are candidates for embedding use.
The values, $d_i$ and $d_j$, constitute a pair of data values, $p_k$.
There are $k = 0, 1, 2, \ldots, N_p$ such pairs in the host data set,
giving a total number of embedding bits, $M_k$, for each pair:

$$M_k = \sum_i f(d_i) + \sum_j f(d_j) \qquad (20)$$

where the summations for i and j run to the limits of the
frequency of occurrence in the data set, $f(d_i)$ and $f(d_j)$, for the
respective data values.

It is now helpful to refer to FIG. 2, wherein a partial listing
of computer code in the C-Language is printed. The
determination of the host data pixel pair values, $d_i$ and $d_j$ in
Equation 10, is accomplished through the code listed in FIG.
2. In FIG. 2, these 8 bit values are interpreted as indices in
a color palette table. The comparison indicated in Equation
10 is therefore required to be a comparison between the
corresponding colors in the palette. Entries in the color
palette are Red, Green, and Blue (RGB) color-component
values, each within the range of 0-255.

If additional information is desired on the format used for
BITMAP® images, reference should be made to two
sources. One is the book, Programming for Graphics Files,
by J. Levine, 1994 (J. Wiley & Sons, New York). The other
is a technical article, "The BMP Format," by M. Luse, Dr.
Dobb's Journal, Vol. 19, Page 18, 1994.

The code fragment in FIG. 2 begins at line 1 with a loop
running over the number of colors in the palette. The loop
index, i, is used to test each palette color against all other
entries, in sequence, to identify pairs of color entries meeting
the criteria established by Equation 10. Each color
identified in the i-loop then is tested against all other colors
in the palette by a second loop using another index, j,
starting at line 16. Line 7 provides a modification for images
which have a palette for greyscale instead of colors. For
greyscale images, the RGB components are identical for
each palette entry, although some grey scale formats include
a 16-color table as well.

The comparison indicated in Equation 10 is made by
converting the Red, Green, and Blue (RGB) color component
values to corresponding Hue, Saturation, and Intensity
(HSI) color components. Line 12 uses a separate routine,
rgbhsi(), to effect this conversion. Line 20 converts RGB
color component values in the j-loop to HSI data structure
components, and line 21 calculates the color difference in
the HSI system. Line 24 then implements the test required
by Equation 10. If the color difference is less than a fixed
noise value (COLOR_NOISE=10 in the listing of FIG. 2),
the intensity difference is tested to determine if the two
palette entries are acceptable as differing by less than the
noise value specified. Two additional constraints are
imposed before accepting the entries as candidate pair
values. First, the difference in color is required to be the
smallest color difference between the test (i-loop) value, and
[...] used to force comparison of only the intensity of the two
palette entries. Greyscale images do not require the RGB to
HSI conversion made for color palettes.

The embedding process of the present invention ignores
differences in the saturation component of color palette
entries because saturation is ordinarily not noticeable in a
color image. Only the Hue and Intensity components are
constrained to fall within fixed noise limits to determine the
palette pair values.

Pixel pair values found by the code listed in FIG. 2
include generally redundant values. The same pixel value, i,
is found in several different pair combinations. Because
multiple pairs cannot contain the same palette entry, due to
each pair combination of pixel values having to be unique,
it is necessary to eliminate some pairs. The number of pairs
located by applying the criterion of Equation 10 is stored in
the variable, no_pairs, in line 51.

Referring now to FIG. 3, the code fragment listed therein
illustrates the manner in which duplicate pairs are eliminated
by a separate routine. First, the histogram of the image is
used to calculate the total number of occurrences in each
pair, as required by Equation 20, above. Line 1 shows the
i-loop used to calculate the value, $M_k$, for each pair. Next,
the pairs are sorted according to decreasing order of the
pair[].count data-structure member in line 5. The elimination
of duplicates in the following line retains the pairs, $p_k$,
having the largest total number of frequency values, $M_k$.
Line 10 and the lines following calculate the total number of
bytes that can be embedded into the host data using the
unique pixel pairs found by this code fragment.

Sorting the pair values in decreasing order of value, $M_k$,
minimizes the number of pairs required to embed a particular
auxiliary data stream. However, the security of the
embedded data is increased significantly if the pair values
are arranged in random order. Randomizing the pair-value
order is part of this invention. This is accomplished by
rearranging the pair-values to random order by calculating a
data structure having entries for an integer index pts[k].i,
k=0,1,2, . . . , no_pairs; and pts[k].gamma=$\delta_0, \delta_1, \ldots,
\delta_r, \ldots, \delta_{no\_pairs}$, where the $\delta_r$ values are random.
Sorting the data structure, pts[], to put the random values in
ascending order randomizes the index values. The random-index
values are used with the pair-values calculated as indicated
above, to re-order the table to give random pair-value ordering.

The algorithm described for palette-format images permits
manipulating pixel values without regard to the individual
frequency of occurrence. Reference should now be
made to FIG. 4 where another code fragment is listed in
which, for Truecolor images, a constraint is introduced on
the frequency of occurrence that minimizes the effect of
embedding on the host data histogram.

Truecolor images consist of three individual 8-bit greyscale
images, one each for the red, green, and blue image
components. Truecolor images have no color palette. The
possible combinations of the three 8-bit components give
approximately 16 million colors. The present invention
embeds data into Truecolor images by treating each RGB
color component image separately. The effect of embedding
on the composite image color is therefore within the noise
value of the individual color intensity components.
all the other (j-loop) values. Second, the number of pairs 60 In FIG. 4. the ip-loop starting in line 2 refers to the color 
selected (k) must be less than half the number of columns in plane (ip=0,l,2 for R,G,B). Tbe frequency of occurrence of 
a row of pixels in the image, in order for the pair-value key each numerical value (0 through 255) is given in the array, 
to be stored in a single row of pixels. This is an algorithmic hist_ values[], with the color plane histograms offset by the 
constraint, and is not required by the invention. quantity, ip*256, in line 7. The variable, fvalue[], holds the 

A data-structure array, pair[], is used to hold the values of 65 floating point histogram values for color-component, ip. 

candidate pairs (i,j) and their total frequency of occurrence, line 11 begins a loop to constrain the pairs selected for 

Mfc. If the image is a greyscale palette, the test at line 35 is nearly equal frequency of occurrence. Pixel intensities 
within the noise limit, RANGE, are selected for comparison of
statistical frequency. The tolerance, δ, for statistical agreement is
fixed at 5% in line 17. This tolerance could be adjusted for
particular applications.

After all possible values are tested for the constraints of noise and
statistical frequency, the pairs found are sorted in line 27, the
duplicates are removed, the starting index is incremented in line 31,
and the search continued. A maximum number of pairs again is set by
the algorithmic constraint that the i and j pair values must be less
than one-half the number of pixels in an image row. As with
palette-format images, the security of the invention includes
randomizing the pair-value entries.

Applying the statistical constraint minimizes the host image effects
of embedding the auxiliary data. If the tolerance, δ, is set at 0,
each pair chosen will contain data values less than the noise value
in intensity separation, and occurring with exactly the same
statistical frequency. Setting the tolerance at δ=5%, as in the code
fragment of FIG. 4, permits the acceptance of pixel pairs that are
close in frequency, while still preserving most of the statistical
properties of the host data. Few, if any, pairs might be found by
requiring exactly the same frequency of occurrence.

The actual embedding of auxiliary data into a set of host data
consists of rearranging the order of occurrence of redundant
numerical values. The pairs of host data values found by analysis are
the pixel values used to encode the bit-stream of the auxiliary data
into the host data. It is important to realize that the numerical
values used for embedding are the values already occurring in the
host data. The embedding process of the current invention does not
alter the number or quantity of the numerical values in the host
data.

In the embedding process of the present invention, the host data are
processed sequentially. A first pass through the host data examines
each value and tests for a match with the pixel-pair values. Matching
values in the host data are initialized to the data-structure value,
pair[k].i, for k=0,1,2 . . . N_p. This step initializes the host
BITMAP® image (FIG. 1) to the pair values corresponding to zeroes in
the auxiliary data. A second pass through the auxiliary data examines
the sequential bits of the data to be embedded, and sets the
pair-value of the host data element to the value i or j, according to
the auxiliary bit value to be embedded. If the bit-stream being
embedded is random, the host data pair-values, i and j, occur with
equal frequency in the host image after the embedding process is
completed.

FIG. 5 illustrates the code fragment that performs the actual
embedding, including the considerable information which is necessary
to manipulate the data in the header information, auxiliary
bit-stream, and the host data files. Lines 1-12 allocate memory and
initialize variables. The header and bit-stream data to be embedded
are denoted the "data-image," and are stored in the array,
data_row[]. The host data are denoted the "image-data."

The index, li, is used in a loop beginning at line 12 to count the
byte position in the data-image. The loop begins with li=-512 because
header information is embedded before the data-image bytes. Line 14
contains the test for loading data_row[] with the header information.
Line 20 contains the test for loading data_row[] with bytes from the
data-image file, tape5.

Line 30 starts a loop for the bits within a data-image byte. The
variable, bitindex=(0,1,2 . . . 7), counts the bit position within the
data-image byte, data_row[d_inrow], indexed by the variable, d_inrow.
The variable, lj, indexes the byte (pixel) in the host image. The
variable, inrow, indexes the image-data buffer, image_row[inrow].
Line 32 tests for output of embedded data (a completed row of pixels)
to the image-data file, and line 40 tests for completion of a pass
through the image-data. One pass through the image-data is made for
each of the pixel pairs, pair[k], k=0,1,2 . . . N_p. In line 47, the
pair index is incremented. A temporary pair data-structure variable
named "pvalue" is used to hold the working pair values of the host
data pixels being used for embedding. Line 60 provides for refreshing
the image-data buffer, image_row.

The embedding test is made at line 72. If the image_row[inrow]
content equals the pair value representing a data-image bit of zero,
no change is made, and the image-data value remains pvalue.i.
However, if the bit-stream value is one, the image-data value is
changed to equal pvalue.j. Line 84 treats the case for image-data
values not equal to the embedding pair value, pvalue.i. In this case,
the bitindex variable is decremented, because the data-image bit is
not yet embedded, and the image-data indices are incremented to
examine the next host data value.

The extraction of embedded data is accomplished by reversing the
process used to embed the auxiliary data-image bit-stream. A
histogram analysis of the embedded image-data set will reveal the
candidate pairs for extraction for only the case where the individual
statistical frequencies are unchanged by the embedding process. In
the listings of FIGS. 2-5, the statistical frequencies are changed
slightly by the embedding process. The pair table used for embedding
can be recreated by analysis of the original image-data, but it
cannot generally be recovered exactly from the embedded image-data.

Additionally, as described above, the invention includes randomizing
the order of the pair-values, thereby increasing greatly the amount
of analysis needed to extract the embedded data without prior
knowledge of the pair-value order.

As previously described, the ordered pairs selected for embedding
constitute the "key" for extraction of the data-image from the
image-data. The listings illustrated in FIGS. 2-5 demonstrate how
embedding analysis reduces the statistical properties of the noise
component in host data to a table of pairs of numerical values. The
key-pairs are required for extraction of the embedded data, but they
cannot be generated by analyzing the host data after the embedding
process is completed. However, the key can be recreated from the
original, unmodified host data. Thus, data embedding is similar to
one-time-pad encryption, providing extremely high security to the
embedded bit-stream.

With the pair table known, extraction consists of sequentially
testing the pixel values to recreate an output bit-stream for the
header information and the data-image. In the present invention, the
pair table is inserted into the host image-data, where it is
available for the extraction process. Optionally, the present
invention permits removing the pair table, and storing it in a
separate file. Typically, the pair table ranges from a few to perhaps
hundreds of bytes in size. The maximum table size permitted is
one-half the row length in pixels. With the pair table missing, the
embedded data are secure, as long as the original host image-data are
unavailable. Thus, the embedding method gives security potential
approaching a one-time-pad encryption method.

Another way of protecting the pair table is to remove the key and
encrypt it using public-key or another encryption process. The
present invention permits an encrypted key to be placed into the host
image-data, preventing extraction by unauthorized persons.

Embedding auxiliary data into a host slightly changes the statistical
frequency of occurrence of the pixel values used
for encoding the bit-stream. Compressed or encrypted
embedding data are excellent pseudo-random auxiliary bit- 
streams. Consequently, embedding auxiliary data having 
pseudo-random properties minimizes changes in the average 
frequency of occurrence of the values in the embedding 
pairs. Embedding character data without compression or 
encryption reduces significantly the security offered by the 
present invention. 
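
The pair-value embedding and extraction passes described above can be sketched in C as follows. This is a minimal illustration under stated assumptions, not the listing of FIG. 5: the names pair_t, embed_bits, and extract_bits are hypothetical, the pairs are assumed to share no values, the 512-byte header is omitted, and the number of embedded bits is passed in explicitly instead of being read from a header.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical pair-table entry: two host values that differ by less
 * than the noise value. Value i encodes a 0 bit, value j a 1 bit. */
typedef struct { uint8_t i, j; } pair_t;

/* Read bit n of a byte stream, most significant bit first (assumed order). */
static int get_bit(const uint8_t *bits, size_t n)
{
    return (bits[n / 8] >> (7 - n % 8)) & 1;
}

/* One pass over the host data per pair. Every host value matching the
 * pair is set to i (bit 0) or j (bit 1). Once the bit-stream is
 * exhausted, remaining matches are left at i, which mirrors the
 * initialization pass in the description. Returns the bits embedded. */
size_t embed_bits(uint8_t *host, size_t len,
                  const pair_t *pairs, size_t npairs,
                  const uint8_t *bits, size_t nbits)
{
    size_t done = 0;
    for (size_t k = 0; k < npairs; k++) {
        for (size_t p = 0; p < len; p++) {
            if (host[p] != pairs[k].i && host[p] != pairs[k].j)
                continue;                       /* not an embedding value */
            if (done < nbits)
                host[p] = get_bit(bits, done++) ? pairs[k].j : pairs[k].i;
            else
                host[p] = pairs[k].i;
        }
    }
    return done;
}

/* Reverse process: walk the host data in the same pair order and emit
 * 0 for value i, 1 for value j. Returns the number of bits recovered. */
size_t extract_bits(const uint8_t *host, size_t len,
                    const pair_t *pairs, size_t npairs,
                    uint8_t *out, size_t nbits)
{
    size_t done = 0;
    memset(out, 0, (nbits + 7) / 8);
    for (size_t k = 0; k < npairs; k++) {
        for (size_t p = 0; p < len && done < nbits; p++) {
            int bit;
            if (host[p] == pairs[k].i)      bit = 0;
            else if (host[p] == pairs[k].j) bit = 1;
            else continue;
            if (bit)
                out[done / 8] |= (uint8_t)(0x80u >> (done % 8));
            done++;
        }
    }
    return done;
}
```

Because every written value is either i or j, a bit-stream with roughly equal numbers of ones and zeroes leaves the two values with roughly equal frequency, which is the property the preceding paragraph relies on when it recommends compressed or encrypted auxiliary data.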

The existence of embedded data is not detected easily by
analyzing the embedded image-data. When viewed as a
cryptographic method, data embedding convolves the
data-image with the image-data. The original data-image
bit-stream embedded represents a plaintext. The combination of
the host and embedded data implants ciphertext in the noise
component of the host. The existence of ciphertext is not
evident, however, because the content and meaning of the
host carrier information is preserved by the present
invention. Data embedding according to the present invention is
distinct from encryption because no obvious ciphertext is
produced.

Those who are unfamiliar with the terms "plaintext" and
"ciphertext" can refer, for example, to B. Schneier, Applied
Cryptography: Protocols, Algorithms, and Source Code in C,
J. Wiley & Sons, New York, N.Y., 1994. This reference is
incorporated herein by reference.

As mentioned previously, the present invention is useful 
in the embedding of auxiliary information into facsimile 
(FAX) data. In the previous discussion concerning embed- 
ding auxiliary information into image host data, the noise 
component originates from uncertainty in the numerical 
values of the pixel data, or in the values of the colors in a 
color palette.

Facsimile transmissions are actually images consisting of 
black and white BITMAP® data, that is, the data from image
pixels are binary (0,1) values representing black or white, 
respectively, and the effect of noise is to either add or 
remove pixels from the data. The present invention, 
therefore, processes a facsimile black-and-white BITMAP® 
image as a 2-color BITMAP®. 

The standard office FAX machine combines the scanner 
and the digital hardware and software required to transmit 
the image through a telephone connection. The images are 
transmitted using a special modem protocol, the character- 
istics of which are available through numerous sources. One 
such source, the User's Manual for the EXP Modem (UM, 
1993), describes a FAX/data modem designed for use in 
laptop computers. FAX transmissions made between com- 
puters are digital communications, and the data are therefore 
suited to data embedding. 

As has been previously discussed with relation to embed- 
ding into images, the FAX embedding process is conducted 
in two stages: analysis and embedding. In the case of a FAX 
2-color BITMAP®, image noise can either add or subtract 
black pixels from the image. Because of this, the length of 
runs of consecutive like pixels will vary. 

The scanning process represents a black line in the source 
copy by a run of consecutive black pixels in the two color 
BITMAP® image. The number of pixels in the run is 
uncertain by at least ±1, because of the scanner resolution 
and the uncertain conversion of original material to black- 
and-white BITMAP® format. 

Applying data embedding to the two color BITMAP®
data example given here therefore consists of analyzing the
BITMAP® to determine the statistical frequency of
occurrence, or histogram, of runs of consecutive pixels. The
embedding process of the present invention varies the length
of runs by (0, +1) pixel according to the content of the
bit-stream in the auxiliary data-image. Host data suitable for
embedding are any two color BITMAP® image which is
scaled in size for FAX transmission. A hardcopy of a FAX
transmission can be scanned to generate the two color
BITMAP®, or the image can be created by using
FAX-printer driver software in a computer.
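
The run-length histogram mentioned above can be illustrated with a short C sketch. It is an assumption-laden example and not the routine of FIG. 6: the function name run_histogram and the value chosen for MAXRUN are hypothetical, and the row is assumed to be stored one byte per pixel with runs made of non-zero bytes.

```c
#include <stddef.h>

#define MAXRUN 16   /* assumed upper limit; the source does not give the value */

/* Tally runs of consecutive non-zero pixels in one row.
 * hist[n] is incremented for each run of length n (1..MAXRUN);
 * longer runs are ignored. Returns the number of runs tallied. */
size_t run_histogram(const unsigned char *row, size_t ncols,
                     unsigned hist[MAXRUN + 1])
{
    size_t runs = 0, c = 0;
    while (c < ncols) {
        if (row[c] == 0) {          /* skip background pixels */
            c++;
            continue;
        }
        size_t start = c;
        while (c < ncols && row[c] != 0)
            c++;                    /* advance to the end of the run */
        size_t n = c - start;
        if (n <= MAXRUN) {
            hist[n]++;
            runs++;
        }
    }
    return runs;
}
```

Calling the function on every row with the same hist[] array accumulates the image-wide histogram of run lengths.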

The FAX embedding process begins by analyzing the
lengths of runs in each row of pixels. The implementation of
this step is illustrated by the code fragment in FIG. 6. The
arguments to the routine, rowstats(), are a pointer to the pixel
data in the row, which consists of one byte per pixel, either
a zero or a one in value; a pointer to an array of statistical
frequencies; the number of columns (pixels) in the data row;
and a flag for internal program options. The options flag is
the size of blocks, or packets, of the auxiliary bitstream to be
embedded. The options flag is tested in line 9, and the
routine, packet_col(), is used for a positive option flag. The
packet_col() routine is given in the listing of FIG. 7, and its
purpose is to ensure that the first pixel in the data row starts
in an even column number. The location of the first pixel in
the row flags the start of the data packets, which will be
further described below.

Line 12 begins a loop to examine the runs of pixels in the
data row. Runs between the defined values MINRUN and
MAXRUN are examined by the loop. The j-loop, and the
test at line 15, locate a run of pixels, and set the variable,
k, to the index of the start of the run. The test at line 21
selects only blocks of pixels having length, i, less than the
length of the row. The loop in line 22 moves the pixel run
to temporary storage in the array block[].

The two tests at lines 24 and 25 reject blocks having run
lengths other than the one required by the current value of
the i-loop. The embedding scheme selects blocks of length,
i, for embedding by adding a pixel to make the length i+1.
This assures that the run can contain either i or i+1 non-zero
pixel values, according to the bit-stream of the auxiliary
embedded data. If the run stored in the block[] array
does not end in at least two zeroes, it is not acceptable as a
run of length, i+1, and the code branches to NEXT, to
examine the next run found.
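
As an illustration of why two trailing zeroes are required, the following C sketch tests a run for usability and then embeds or extracts one bit by setting the pixel that trails the run. The function names are hypothetical and the code is a simplified reading of the description, not the listing of FIG. 6 or FIG. 8.

```c
#include <stddef.h>

/* Return 1 if a run of exactly i non-zero pixels starts at column k
 * and is followed by at least two zero pixels, otherwise 0. */
int run_is_usable(const unsigned char *row, size_t ncols, size_t k, size_t i)
{
    if (i == 0 || k + i + 2 > ncols)
        return 0;                       /* no room for two trailing zeroes */
    if (k > 0 && row[k - 1] != 0)
        return 0;                       /* the run does not start at k */
    for (size_t c = k; c < k + i; c++)
        if (row[c] == 0)
            return 0;                   /* run is shorter than i */
    return row[k + i] == 0 && row[k + i + 1] == 0;
}

/* Embed one bit: the run keeps length i for a 0 bit and grows to
 * length i+1 for a 1 bit. */
void embed_bit_in_run(unsigned char *row, size_t k, size_t i, int bit)
{
    row[k + i] = bit ? 1 : 0;
}

/* Extract the bit by inspecting the pixel that trails the run. */
int extract_bit_from_run(const unsigned char *row, size_t k, size_t i)
{
    return row[k + i] != 0;
}
```

With two zeroes after the run, the lengthened run of i+1 pixels is still followed by one zero, so it cannot merge with the next run in the row.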

Line 28 begins a loop to count the number of pixels in the
run. The number found is incremented by one in line 31 to
account for the pixel added to make the run length equal to
i+1. Line 33 contains a test ensuring that the run selected has
the correct length. The histogram[] array for the run-length
index, i, is incremented to tally the occurrence frequency of
the run. The data row bytes for the run are flagged by the
loop in line 36, with a letter code used to distinguish the runs
located. This flagging technique permits the embedding
code to identify easily the runs to be used for embedding the
bit-stream. On exit from this routine, the data row bytes
contain runs flagged with letter codes to indicate the usable
pixel positions for embedding the bit-stream. The return
value is the number of runs located in the data row. A return
of zero indicates no runs within the defined limits of
MINRUN and MAXRUN were located. 
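
The flagging step can be sketched in C as shown below. This is a hedged approximation of what the description attributes to rowstats(), not the routine itself: the name flag_runs, the MINRUN and MAXRUN values, and the choice of 'a' for the shortest accepted run are all assumptions, and only the i pixels of the run (not the trailing pixel) are flagged.

```c
#include <stddef.h>

#define MINRUN 2    /* assumed limits; the source does not give the values */
#define MAXRUN 6

/* Flag every usable run in a row (one byte per pixel, 0 or 1).
 * A run is usable if its length i lies in MINRUN..MAXRUN and it is
 * followed by at least two zero pixels. Its pixels are overwritten
 * with a letter code, 'a' for MINRUN, 'b' for MINRUN+1, and so on,
 * and histogram[i] is incremented. Returns the number of runs flagged. */
size_t flag_runs(unsigned char *row, size_t ncols,
                 unsigned histogram[MAXRUN + 1])
{
    size_t found = 0, c = 0;
    while (c < ncols) {
        if (row[c] == 0) {
            c++;
            continue;
        }
        size_t k = c;                       /* start of the run */
        while (c < ncols && row[c] != 0)
            c++;
        size_t i = c - k;                   /* run length */
        if (i < MINRUN || i > MAXRUN)
            continue;
        /* row[c] is zero here when c < ncols; require a second zero. */
        if (c + 1 >= ncols || row[c + 1] != 0)
            continue;
        unsigned char letter = (unsigned char)('a' + (i - MINRUN));
        for (size_t p = k; p < c; p++)
            row[p] = letter;
        histogram[i]++;
        found++;
    }
    return found;
}
```

In this sketch the letter codes are non-zero, so flagged pixels still read as part of a run; they would need to be reset to binary values before the row is written back to the image file.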

FAX modem protocols emphasize speed, and therefore do
not include error-correction. For this reason, FAX
transmissions are subject to drop-outs, to impulsive noise, and to lost
data, depending on the quality of the telephone line and the
speed of the transmission. For successful embedding, the
present invention must account for the possible loss of some
portion of the image data. To accomplish this, a variation of
modem block-protocols is used to embed the header and the
auxiliary data. The two color image is treated as a
transmission medium, with the data embedded in blocks, or packets,
providing for packet-start flags, and parity checks. The start



of a packet is signaled by an image row having its first pixel in an
even column. The packet ends when the number of bits contained in the
block are extracted, or, in the case of a corrupted packet, when a
packet-start flag is located in a line. A checksum for parity, and a
packet sequence number, are embedded with the data in a packet. Using
this method, errors in the FAX transmission result in possible loss
of some, but not all, of the embedded data.

The amount of data lost because of transmission errors depends on the
density of pixels in the source image and the length of a dropout.
Using 20 bytes per packet, a large dropout in transmission of
standard text results in one or two packets of lost data. Generally,
the success of the invention follows the legibility of the faxed host
image information.

Turning now to FIG. 7, there can be seen a listing of the steps
necessary to initialize the two color BITMAP® lines to flag the start
of each packet. Each row in the two color image contains a non-zero
value beginning in an even column (packet start), or in an odd column
(packet continuation).

In FIG. 7, it can be seen that line 4 starts a loop over the number
of pixels in a data row. In FAX images, a zero (0) pixel value
indicates a black space, and a one (1) value indicates a white space.
Line 5 locates the first black space in the data for the row. If the
variable, packet_size, is positive, the column index is tested to be
even and the pixel is forced to be a white space. If the packet_size
variable is negative, the routine returns an indicator of the data
row flag without making changes. If packet_size is greater than zero,
the first data row element is flagged as a white space. Line 11 deals
with the case in which packet_size=0, indicating a continuation row.
In the event of a continuation row, the first data row element is
forced to a black space. The values [...]

The code fragment listed in FIG. 8 provides auxiliary data embedding
into two color BITMAP® FAX images. The pixels in a row are processed
as described above by examining the contents of the data row after it
has been analyzed and flagged with letter codes to indicate the run
lengths. Lines 1 through 49 are part of a large loop (not shown) over
the pixel index, lj, in the two color BITMAP® image. Lines 1-26
handle the reading of one line of pixels from the two color BITMAP®,
and store the row number of the image in the variable, nrow, in
line 1. The pixel value bits are decoded and expanded into the
image_row[] array in lines 12-56. The image_row[] array contains the
pixel values stored as one value (0 or 1) per byte.

Line 28 uses the packet_col() routine to return the packet-index for
the row. If j is 0 in line 28, the row is a packet-start row, and if
j is 1, the row is a continuation row. Line 29 uses the rowstats()
routine to assign run-length letter flags to the pixels in the row
buffer. The return value, i, gives the number of runs located in the
image row. Consistency tests are made at lines 31, 37, and 41. The
index, kp, gives the pixel row number within a data packet. If kp is
0, the line must be a packet-start index, and if kp>0, the line must
be a continuation line. Line 49 completes the process of reading and
preprocessing a row of two color image data.

The data-structure array, pair[], contains the run length for (i),
the augmented run length, (i+1), and the total number of runs in the
two color BITMAP® image. The index, k, in the loop starting at
line 51, is the index for the run lengths being embedded. The index,
inrow, counts pixels within the image row buffer, and the variable,
bitindex, is the bit-position index in the bit-stream byte.

Line 57 sets the value of the run-length letter-code in the variable,
testltr. The value of an image pixel is tested against the
letter-code in line 58. If the test letter-code flag is located,
line 60 advances the index in the row to the end of the pixel run
being used for embedding. The test function in line 62 checks the
value for the current bit index in the bit-stream packet byte. If the
value is one, the last pixel in the run is set to one. Otherwise, the
last pixel in the run is set to 0.

Setting the value of the pixel trailing a run implements the
embedding in the two color BITMAP® images by introducing noise
generated according to the pseudo-random bit-stream in the packet
data. The letter flag values written into the row buffer by the call
to rowstats() in FIG. 8 are reset to binary unit value before the
image_row array data are packed and written back to the .BMP format
file. The process for doing this is not illustrated in FIG. 8, but is
straightforward for those skilled in the art.

Extraction of data embedded into a two color BITMAP® FAX image,
according to the present invention, can be accomplished only if the
transmission of the FAX is received by a computer. The image data are
stored by the receiving computer in a file format (preferably a FAX
compressed format), permitting the processing necessary to convert
the image to BITMAP® format and to extract the embedded data. FAX
data sent to a standard office machine are not amenable to data
extraction because the printed image is generally not of sufficient
quality to allow for recovery of the embedded data through scanning.

However, the invention does apply to scanning/printing FAX machines
that process data internally with computer hardware. Auxiliary
embedded data are inserted after the scanning [...] but prior to
transmission. The [...] changed from the original (i, i+1) values.
The order in which the values are used depends on the frequency of
occurrence in the image. As in the example for palette-color images,
a key to the value and order of the pairs used for embedding is
inserted into the FAX. However, the key is not strictly required,
because, in principle, knowledge of the defined values MINRUN and
MAXRUN permits re-calculating the run-length statistics from the
received image. In practice, the key is required because transmission
errors in the FAX-modem communication link can introduce new
run-lengths that alter the statistical properties of the image, and
because the pair ordering is not known. Even though FAX embedding is
somewhat less secure than embedding auxiliary data into palette-color
images, the two color BITMAP® FAX embedding of data still can be
regarded as similar to one-time-pad cryptography.

The foregoing description of the preferred embodiments of the
invention have been presented for purposes of illustration and
description. It is not intended to be exhaustive or to limit the
invention to the precise form disclosed, and obviously many
modifications and variations are possible in light of the above
teaching. The embodiments were chosen and described in order to best
explain the principles of the invention and its practical application
to thereby enable others skilled in the art to best utilize the
invention in various embodiments and with various modifications as
are suited to the particular use contemplated. It is intended that
the scope of the invention be defined by the claims appended hereto.

What is claimed is:

1. A method of embedding auxiliary data into host data comprising the
steps of:

creating a digital representation of said host data in the form of
elements having numerical values and containing a noise component;




creating a digital representation of said auxiliary data in
the form of a sequence of individual bit values;

evaluating said noise component of said digital represen- 
tation of said host data; 

comparing pairs of said elements with said noise compo- 
nent to determine pairs of said elements having numeri- 
cal values which differ by less than said value of said 
noise component; 

replacing individual values of said elements with substan- 
tially equivalent values from said pairs of elements in 
order to embed individual bit values of said auxiliary 
data corresponding to said sequence of bit values of 
said auxiliary data; and 

outputting said host data with said auxiliary data embed- 
ded into said host data as a file. 

2. The method as described in claim 1 further comprising 
the step of combining said auxiliary data with predetermined 
information indicative of said auxiliary data, its file name, 
and file size, said step to be performed after the step of 
digitizing said auxiliary data. 
3. The method as described in claim 1 further comprising
the step of determining a protocol for embedding said
auxiliary data into said host data which allows for
verification of said auxiliary data upon extraction from said host
data.

4. The method as described in claim 1, wherein said host
data comprises a color photograph.

5. The method as described in claim 1, wherein said host 
data comprises a black and white photograph. 

6. The method as described in claim 1, wherein said host
data comprises a television signal.

7. The method as described in claim 1, wherein said host 
data comprises a painting. 

8. The method as described in claim 1, wherein said host
data comprises a facsimile transmission.

9. The method as described in claim 1, wherein said host 
data comprises an identification card. 

10. The method as described in claim 1, wherein said host 
data comprises digital audio information. 

***** 